

Search for: All records where Creators/Authors contains: "Li, Yixing"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full-text articles may not yet be available free of charge during the embargo (administrative interval).

Some links on this page may take you to non-federal websites. Their policies may differ from those of this site.

  1. Free, publicly accessible full text available July 9, 2024
  2. This paper aims to reduce computation for RetinaNet, an mAP-30-tier network, to facilitate its practical deployment on edge devices for IoT-based object detection services. We first validate that RetinaNet has the best FLOPs-mAP trade-off among all mAP-30-tier networks. Then, we propose a light-weight RetinaNet structure with an effective computation-accuracy trade-off, obtained by reducing FLOPs only in the computationally intensive layers. Compared with the most common way of trading computation for accuracy, input image scaling, the proposed solution shows a consistently better FLOPs-mAP trade-off curve. Light-weight RetinaNet achieves a 0.3% mAP improvement at the 1.8x FLOPs-reduction point over the original RetinaNet, and gains 1.8x more energy efficiency on an Intel Arria 10 FPGA accelerator in the context of edge computing. The proposed method can potentially help a wide range of object detection applications move closer to a preferred corner of the runtime-accuracy space, while enjoying more energy-efficient inference at the edge. (A FLOPs-estimation sketch follows this list.)
  3. A binary neural network (BNN) is a compact form of neural network. Both the weights and activations in a BNN can be binary values, which leads to a significant reduction in both parameter size and computational complexity compared to a full-precision counterpart. Such reductions can directly translate into a reduced memory footprint and computation cost in hardware, making BNNs highly suitable for a wide range of hardware accelerators. However, it is unclear whether and how a BNN can be further pruned for ultimate compactness. As both 0s and 1s are non-trivial in BNNs, it is not appropriate to adopt any existing pruning method for full-precision networks that interprets 0s as trivial. In this paper, we present a pruning method tailored to BNNs and show that a BNN can be further pruned by using weight flipping frequency as an indicator of sensitivity to accuracy. Experiments performed on the binary versions of a 9-layer Network-in-Network (NIN) and AlexNet with the CIFAR-10 dataset show that the proposed BNN-pruning method achieves a 20-40% reduction in binary operations with a 0.5-1.0% accuracy drop, which leads to a 15-40% run-time speedup on a TitanX GPU. (A flip-frequency sketch follows this list.)
  4. Due to their high computational complexity and memory storage requirements, it is hard to directly deploy a full-precision convolutional neural network (CNN) on embedded devices. Hardware-friendly designs are needed for resource-limited and energy-constrained embedded devices. Emerging solutions have been adopted for neural network compression, e.g., binary/ternary weight networks, pruned networks, and quantized networks. Among them, the binary neural network (BNN) is believed to be the most hardware-friendly framework due to its small network size and low computational complexity. No existing work has further shrunk the size of a BNN. In this work, we explore the redundancy in BNNs and build a compact BNN (CBNN) based on bit-level sensitivity analysis and bit-level data pruning. The input data is converted to a high-dimensional bit-sliced format. In the post-training stage, we analyze the impact of the different bit slices on accuracy. By pruning the redundant input bit slices and shrinking the network size, we are able to build a more compact BNN. Our results show that we can further scale down the network size of the BNN by up to 3.9x with no more than a 1% accuracy drop. The actual runtime can be reduced by up to 2x and 9.9x compared with the baseline BNN and its full-precision counterpart, respectively. (A bit-slicing sketch follows this list.)
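For item 2 (light-weight RetinaNet), here is a minimal sketch of the kind of per-layer FLOPs accounting that motivates trimming only the computationally intensive layers, and of how channel slimming in a heavy layer compares to input scaling. The layer shapes, the halved channel count, and the 0.8x input scale are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch: estimate per-layer conv FLOPs to locate the heavy layers,
# then compare two ways of cutting computation. All shapes are hypothetical.

def conv_flops(h, w, c_in, c_out, k):
    """Multiply-accumulate count of a k x k convolution on an h x w x c_in input."""
    return h * w * c_in * c_out * k * k

# Hypothetical detection-head layers: (name, height, width, c_in, c_out, kernel)
layers = [
    ("P3_head", 80, 80, 256, 256, 3),
    ("P4_head", 40, 40, 256, 256, 3),
    ("P5_head", 20, 20, 256, 256, 3),
]

total = sum(conv_flops(h, w, ci, co, k) for _, h, w, ci, co, k in layers)
for name, h, w, ci, co, k in layers:
    f = conv_flops(h, w, ci, co, k)
    print(f"{name}: {f / 1e9:.2f} GFLOPs ({100 * f / total:.1f}% of total)")

# Cutting FLOPs only where they concentrate (the paper's idea) versus
# scaling the whole input image (the common baseline):
slim = conv_flops(80, 80, 256 // 2, 256, 3)  # halve channels in the heaviest layer
scaled = conv_flops(64, 64, 256, 256, 3)     # 0.8x input scaling: 80 -> 64
print(f"heavy layer, channel-slimmed: {slim / 1e9:.2f} GFLOPs")
print(f"heavy layer, input-scaled:    {scaled / 1e9:.2f} GFLOPs")
```

The point of the comparison is that input scaling shrinks every layer indiscriminately, while targeted slimming spends the FLOPs budget where it matters, which is why the paper reports a better FLOPs-mAP curve.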
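For item 3 (BNN pruning), a toy sketch of measuring weight flipping frequency across training snapshots, the sensitivity indicator named in the abstract. The random snapshots and the 70% quantile threshold are stand-ins, and the rule of pruning the most frequently flipping weights is an assumption for illustration; the paper's exact selection criterion may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for snapshots of one binary weight tensor across training steps.
steps, n = 50, 16
snapshots = np.sign(rng.standard_normal((steps, n)))  # +1/-1 values per step

# Flip frequency: fraction of consecutive steps where a weight changes sign.
flips = (snapshots[1:] != snapshots[:-1]).mean(axis=0)

# Illustrative rule: prune the most frequently flipping weights, assuming
# they carry the least stable signal (an assumption, not the paper's spec).
threshold = np.quantile(flips, 0.7)
keep = flips <= threshold
print(f"flip frequencies: {np.round(flips, 2)}")
print(f"pruned {int((~keep).sum())} of {n} weights")
```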
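For item 4 (compact BNN), a small sketch of the bit-sliced input format and the bit-level pruning idea: decompose an 8-bit input into binary planes, then drop the planes a sensitivity analysis would deem redundant. The toy image and the choice to drop low-order bits first are assumptions for illustration; in the paper, which slices to prune is decided by the post-training accuracy analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)  # toy 8-bit input

# Bit-sliced format: one binary plane per bit, most significant first.
slices = np.stack([(img >> b) & 1 for b in range(7, -1, -1)])  # shape (8, 4, 4)

# Bit-level pruning: keep only the top planes and check how much of the
# original signal survives (here, by reconstruction error on the input).
for kept in (8, 6, 4):
    recon = sum(slices[i].astype(np.int32) << (7 - i) for i in range(kept))
    err = np.abs(recon - img.astype(np.int32)).mean()
    print(f"keep top {kept} bit slices -> mean abs error {err:.2f}")
```

Fewer input bit slices means proportionally fewer binary operations in the first layer, which is where the compact BNN recovers its size and runtime savings.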